1 The Sanctuary Project

The  Sanctuary  project  investigated  the  engineering  fundamentals  of  mobile  code  security.  We  believe  that  mobile 
code  will  become  a  more  attractive  component  of  a  distributed  system  designer’s  toolbox  as  an  inevitable  result  of 
technology  trends.  Hardware  trends  continue  to  make  computational  resources  ever  cheaper  and  more  plentiful,  while 
the  constraints  of  geographic  separation  remain  essentially  fixed.  Long-distance  communication  latencies  will  become 
a  critical  bottleneck.  Co-locating  code  and  resources  reduces  communication  latency  and  increases  overall  system 
throughput. 

Often, the intended goal of using mobile code, to make distributed systems operate more efficiently, seems to
become unreachable once the realities of widely distributed systems are considered. The computational resources in these
systems belong to disparate security domains, and trusting that the remote execution of some code occurred correctly
requires a tremendous leap of faith.

1.1  Goals 

The goal of the Sanctuary project is to investigate the security issues that arise when mobile code is used as a
component of a distributed system.

The security problem examined decomposes into two main areas: (1) server security and (2) mobile code (or
mobile agent) security. For the former, the assets being protected are the resources accessible at a host for mobile
code, as well as resources carried by mobile agents from one user or principal that should be partitioned from access
by another. The host should provide fine-grained controlled access to these resources, and of course provide the
necessary basic support for mobile code.

The latter problem is more fundamental in nature, and variations of it have been explored earlier as copy protection
[11], trusted computing [6, 25], or the secure remote execution problem [3, 7]. It can be subdivided into two
security properties for computation, analogous to those for communication security: (1) computational confidentiality,
and (2) computational integrity. The former refers to being able to compute a function remotely without the owner of
the remote machine hosting the computation being able to determine what that remote code is computing. The latter
refers to being able to trust that the result of the remotely executing code has not been tampered with, i.e., that the result
would be identical if the computation had occurred on a trusted secure machine.


2  Accomplishments 

The  Sanctuary  project’s  investigation  in  mobile  code  security  resulted  in  advances  in  several  areas. 

We discovered and proved correct, in the concrete security setting, a simple way to implement symmetric key
cryptographic schemes with the forward security property [23, 4]. By using forward-secure cryptography, we enabled
multi-hop mobile agents to commit to partial results determined prior to visiting any malicious hosts, thereby making
it computationally infeasible for those malicious hosts to tamper with or otherwise falsify the earlier partial results.

We investigated techniques to factor out support for process migration from the mobile code system, enabling the
use of a simpler, more modular architecture.

We discovered and investigated techniques to monitor remotely executing processes, as well as secure speculative/predictive
remote execution (S2PRE), enabling the use of mobile code on remote, untrusted machines to improve
distributed system throughput without compromising computational integrity [24]. While this technique does not provide
any computational confidentiality, it allows us to use mobile code to speed up remote procedure calls (RPCs) so
that the RPCs appear to have the same latency as with full resource co-location.

2.1  Mobile  Code  Execution  Environments 

We  expended  a  great  deal  of  effort  in  designing  and  building  a  secure  mobile  code  execution  environment.  Here,  the 
primary  goal  was  to  ensure  that  rogue  mobile  code  cannot  compromise  the  security  of  the  hosting  service.  Doing  this 
enabled  us  to  examine  closely  the  interrelationship  between  mobile  code  security  and  server  security,  and  to  build  in 
support  within  the  server  needed  to  make  the  implementation  of  mobile  code  security  techniques  feasible. 

Matthew  Hohlfeld  designed  and  implemented  the  Sanctuary  server,  a  secure  mobile  code  execution  environment. 
He  focussed  primarily  on  server  security  but  also  on  providing  the  tools/hooks  needed  to  address  agent  security  [8]. 

To a first approximation, the mobile code server is essentially a Java-based sandbox that controls access to local
resources. It provides a rich interface that goes far beyond simply controlling local file or network access. The server
provides interfaces that enable local services to use a secure inter-agent communication mechanism to authenticate,
authorize, and account for use of access-restricted resources. For specialized resources such as cryptographic keys
that mobile code might carry, it also provides protocol coordination to ensure that process migration is coordinated
with the secure transfer of those keys.

Another class of resource that is carried by migrating processes is communication endpoints. This work was the
core of Juliana Wong’s Master’s thesis. She designed and built a secure network interprocess communication system
(IPC) for mobile agents [21], patterned largely on the IPC mechanism of the Mach operating system [2].

In our mobile code system, mobile agents carry with them access rights to ports, which can be thought of as
message queues. Each port has exactly one extant receive right, but there may be many send rights as well
as send-once rights. We apply a simple cryptographic technique, key splitting, to associate a unique cryptographic
key with each send or send-once right, which lets us apply message authentication codes to messages. As a result,
a host compromise must occur before any forged message can be created. Moreover, an agent (or its host) that
receives a send right can only forge messages that appear to originate from a source below it in the derivation tree, i.e.,
from a send (or send-once) right derived from the send right held by the compromised agent (or host).
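
The derivation-tree property can be illustrated with a minimal sketch. It assumes that each derived right's key is computed as an HMAC of its parent's key over a label; the report does not specify the derivation function used, so the class name, the labels, and the use of HMAC-SHA256 are all hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical sketch: per-right keys arranged in a derivation tree.
final class PortRightKeysSketch {
    static byte[] hmac(byte[] key, String data) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(key, "HmacSHA256"));
            return mac.doFinal(data.getBytes(StandardCharsets.UTF_8));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Key for a right derived from a parent right (assumed one-way derivation).
    static byte[] deriveChild(byte[] parentKey, String childLabel) {
        return hmac(parentKey, "derive:" + childLabel);
    }

    // Message authentication code computed with the key of the sending right.
    static byte[] tagMessage(byte[] rightKey, String message) {
        return hmac(rightKey, "msg:" + message);
    }

    // The receive-right holder knows the root key and re-derives the sender's key from its path.
    static boolean receiverVerifies(byte[] rootKey, String[] path, String message, byte[] tag) {
        byte[] k = rootKey;
        for (String label : path) {
            k = deriveChild(k, label);
        }
        return MessageDigest.isEqual(tagMessage(k, message), tag);
    }
}
```

In this sketch, a holder of the key for right A can compute the key for A/B and so forge messages from A/B, but cannot produce a valid tag for A's parent or for a sibling right.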

The expected frequency at which mobile agents migrate has a marked impact on the design of our distributed
mobile-agent IPC mechanism. If the agent holding the receive right to a port migrates, the (remote) agents holding
send rights must either be notified of the new location of the port, or the host which the agent is leaving must remain
available and be able to forward messages. In this work, such a host forwards messages only temporarily and can
stop at any time. If a message arrives that should be forwarded, it is forwarded and, at the same time, a path update
is sent to the receiver; this “forwarding path compression” updates frequent senders with the current location of the port’s
receive right. If the forwarder has been rebooted or has garbage collected the forwarding information, it sends a
failure message, and the sender must then use a “home agent server” (akin to a “home agent” in mobile IP) to
discover the port’s current location.
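
The forwarding behaviour can be sketched as a small in-memory simulation. This is an illustration only: the class, the method names, and the single shared sender cache are assumptions, and real messages, timeouts, and authentication are omitted.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical simulation of temporary forwarding with a home agent fallback.
final class PortForwardingSketch {
    static final class Host {
        final Set<String> localPorts = new HashSet<>();      // receive rights held here
        final Map<String, String> forward = new HashMap<>(); // port -> next host (may be dropped)
    }

    final Map<String, Host> hosts = new HashMap<>();
    final Map<String, String> homeAgent = new HashMap<>();   // port -> authoritative location
    final Map<String, String> senderCache = new HashMap<>(); // port -> sender's last known location

    Host host(String name) {
        return hosts.computeIfAbsent(name, n -> new Host());
    }

    void create(String port, String at) {
        host(at).localPorts.add(port);
        homeAgent.put(port, at);
        senderCache.put(port, at);
    }

    // The receive right moves; the old host keeps a forwarding pointer for a while.
    void migrate(String port, String from, String to) {
        host(from).localPorts.remove(port);
        host(from).forward.put(port, to);
        host(to).localPorts.add(port);
        host(to).forward.remove(port);
        homeAgent.put(port, to);
    }

    // Models a reboot or garbage collection of forwarding state.
    void forgetForwarding(String hostName) {
        host(hostName).forward.clear();
    }

    // Returns how the message reached the port; the sender's cache is updated on the way.
    String send(String port) {
        if (!homeAgent.containsKey(port)) {
            throw new IllegalArgumentException("unknown port: " + port);
        }
        String at = senderCache.get(port);
        String outcome = "delivered";
        while (!host(at).localPorts.contains(port)) {
            String next = host(at).forward.get(port);
            if (next == null) {
                at = homeAgent.get(port); // failure message: fall back to the home agent server
                outcome = "recovered-via-home-agent";
            } else {
                at = next;                // temporary forwarding
                if (outcome.equals("delivered")) {
                    outcome = "forwarded";
                }
            }
        }
        senderCache.put(port, at);        // path update (forwarding path compression)
        return outcome;
    }
}
```

After a forwarded send, the next send from the same sender goes directly to the new location, which is the effect the path update is meant to achieve.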

2.2  Forward-Secure  Cryptography 

To enable secure mobile agents to detect tampering with partial results arrived at on earlier servers, we use
forward-secure symmetric-key cryptography [4] to generate forward-secure authentication codes. The use of forward-secure
cryptography in practice is complicated by the need to integrate the key derivation and transfer with the task of mobile
agent migration.
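
To make the idea concrete, the following is a minimal sketch of one common way to obtain forward-secure authentication codes: the key is replaced by its hash at every hop, and each partial result is tagged under the key of its epoch. It is not claimed to be the construction of [4]; the class name and the choice of SHA-256 and HMAC-SHA256 are assumptions.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Arrays;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical sketch: hash-chain key evolution, one epoch per hop.
final class ForwardSecureMacSketch {
    private byte[] key; // current epoch key; earlier keys are not retained

    ForwardSecureMacSketch(byte[] initialKey) {
        this.key = initialKey.clone();
    }

    // Authenticate a partial result under the current epoch key.
    byte[] tag(String partialResult) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(key, "HmacSHA256"));
            return mac.doFinal(partialResult.getBytes(StandardCharsets.UTF_8));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Advance to the next epoch with a one-way function before migrating.
    void evolve() {
        try {
            byte[] next = MessageDigest.getInstance("SHA-256").digest(key);
            Arrays.fill(key, (byte) 0); // best effort only: pure Java cannot guarantee erasure
            key = next;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // The agent's owner kept the initial key and re-derives the epoch key to verify.
    static boolean verify(byte[] initialKey, int epoch, String partialResult, byte[] tag) {
        ForwardSecureMacSketch owner = new ForwardSecureMacSketch(initialKey);
        for (int i = 0; i < epoch; i++) {
            owner.evolve();
        }
        return MessageDigest.isEqual(owner.tag(partialResult), tag);
    }
}
```

A host that sees only the key for epoch i cannot recompute the keys for earlier epochs, so it cannot produce valid tags for altered earlier partial results.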

Aditya Ojha investigated how to support forward-secure cryptography in our Java-based mobile code framework in
his Master’s thesis work. He implemented a Java cryptography library using the Java Native Interface (JNI) that provides
strong forward security properties [13]. This is needed because the Java memory model makes it impossible to ensure
that cryptographic keys can be securely deleted. By using JNI to access C code, mobile agents are assured that if they
are migrating away from an uncompromised server, then all traces of cryptographic keys will have been deleted once
they arrive at the new server.

Forward-secure cryptography support and its integration with the mobile agent server are critical to supporting
secure mobile code.

2.3  Programming  Language  Support  for  Migrating  Processes 

Simplicity of design and modularization enable us to reason more clearly about security. Rather than trying to capture
the state of a running Java program by changing the low-level internals of the Java Virtual Machine (JVM) implementation
[15], we chose to relieve the mobile agent server of any such responsibility. Instead, we move the complexity
outside of the JVM and use a special compiler for a special dialect of Java, together with library code, to implement process
migration.

Our Java dialect provides support for continuations as first-class objects, and process checkpointing and migration are
implemented as a library package on top of these continuations. (We also implemented user-level threads in Java using
continuations; since our dialect does not support multithreaded agents, this is actually useful!) Rahul Lahoti designed
and implemented a compiler that accepts this dialect and outputs standard Java source code as his Master’s thesis work
[9, 12]. Because the compilation process involves a series of simple, highly stylized parse tree transformations, the
correctness of the compiler is amenable to code inspection and thorough testing.
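
As a rough illustration of why migration can live outside the JVM, the sketch below is hand-written standard Java in the style that such a source-to-source transformation might plausibly produce: local variables are hoisted into fields so that the remaining computation is an ordinary serializable object. It is not output of the project's compiler, and the class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical sketch: an agent whose "continuation" is its own field state.
final class MigratableSumSketch implements Serializable {
    private static final long serialVersionUID = 1L;

    private final int limit;
    private int i = 1;    // hoisted loop variable
    private long sum = 0; // hoisted accumulator

    MigratableSumSketch(int limit) {
        this.limit = limit;
    }

    // Runs until the next migration point; returns true once the computation is finished.
    boolean resume() {
        while (i <= limit) {
            sum += i;
            i++;
            if (i % 4 == 0) {
                return false; // migration point: all live state is in fields
            }
        }
        return true;
    }

    long result() {
        return sum;
    }

    // Capture the agent state as bytes that could be shipped to another server.
    byte[] checkpoint() {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(this);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Rebuild the agent on the receiving side from a checkpoint image.
    static MigratableSumSketch restore(byte[] image) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(image))) {
            return (MigratableSumSketch) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A driver would call resume() until it returns true, passing the checkpoint() bytes to restore() at each migration point; no JVM changes are involved.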

We believe that this approach of factoring out migration from the virtual machine makes it far easier to secure
the system. The interfaces to the agent server remain simple, and the compiler can be easily tested in a completely
independent manner from that of the base system. Furthermore, this independence insulates us from changes to the
JVM, so, for example, improvements to the byte code compiler are readily incorporated into the system.

2.4  Making  Secure  Migration  Decisions 

Having  a  simple  mechanism  to  migrate  an  agent  to  a  new  server  is  perhaps  a  necessary  condition  for  secure  migration, 
but  it  is  not  a  sufficient  one.  Clearly,  in  a  large-scale  distributed  mobile  code  environment  where  multiple  security 
domains  are  involved,  knowing  whether  the  agent  should  migrate  to  the  new  server  is  critical. 

To this end, Poomaprajna Udupi investigated security attribute certificates and implemented software used to
reason about trust as part of his thesis work [20]. In this work, we use attribute certificates to attest to security properties
of hosts of mobile code. The attested properties may include the results of recent penetration tests,
a system’s Orange Book or Common Criteria evaluation [18] or FIPS 140-2 [19] rating, the amount of insurance or
indemnification, etc. By making the security properties explicit and permitting mobile code to make decisions based on
this data, we enable the authors and users of mobile code to define security policies that make explicit tradeoffs between
the efficiency gains that would accrue from using a remote service and the amount of risk to undertake.

The  security  attribute  certification  infrastructure  design  is  based  on  that  of  SDSI/SPKI  [16,  5],  and  derives  much 
from  the  simplicity  and  flexibility  of  the  Lisp-like  syntax.  The  extensibility  implies  having  to  deal  with  a  wide  range 
of  attributes,  including  some  that  may  change  with  time.  The  security  attribute  certification  infrastructure  includes  an 
on-line  attestation  component  to  permit  relying  parties  to  ensure  that  a  given  attribute  certificate  has  not  been  revoked. 
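
A policy check of this kind can be sketched very simply. The example assumes that attribute values have already been extracted from certificates whose signatures and revocation status were checked elsewhere; the attribute names (`fips140Level`, `insuranceUsd`) and the "minimum value" policy form are hypothetical and are not taken from [20].

```java
import java.util.Map;

// Hypothetical sketch: decide whether an agent may migrate to a host.
final class MigrationPolicySketch {
    // attested: numeric attributes taken from verified, unrevoked certificates for the host.
    // required: the policy's minimum acceptable value for each attribute it cares about.
    static boolean mayMigrate(Map<String, Integer> attested, Map<String, Integer> required) {
        for (Map.Entry<String, Integer> rule : required.entrySet()) {
            Integer have = attested.get(rule.getKey());
            if (have == null || have < rule.getValue()) {
                return false; // missing or insufficient attestation: do not migrate
            }
        }
        return true;
    }
}
```

A stricter policy lowers risk but rules out more hosts, which is the efficiency versus risk tradeoff described above.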

2.5  Detecting  Remote  Tampering 

More recently, we explored approaches that detect tampering with a remotely executing agent at points in its execution
other than during migration or after the agent returns. The first approach, State Transition Inconsistency
Detection (STID), uses code instrumentation to incorporate state-transition event sensors in the mobile code and allows
a wide spectrum of performance versus sensitivity tradeoffs. The second approach, Secure Speculative/Predictive
Remote Execution (S2PRE), assumes the ability to roll back transactions and uses remote execution as a prediction
unit co-located with remote resources to essentially eliminate long-distance RPC roundtrip latency. This scheme runs
a program twice (once locally on trusted servers and once, as a remote prediction unit, co-located with the remote
resource) but is provably completely secure with respect to computational integrity.

As part of her Master’s thesis work, Yekaterina Tsipenyuk investigated how to monitor remotely executing code
to ensure that its integrity of execution is not compromised [17, 22]. We explored using static analysis techniques
to identify code paths in which state transition sensors should be placed and to identify a set of state transitions that
are monotonic in nature. When executed, the sensors transmit state transition events to a monitor process at a trusted
server, and the monitor uses a state machine model to detect illegal state transitions. Katrina now works at Fortify Inc,
a startup company that is commercializing the use of static analysis to detect security bugs in software.
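
The monitor side of this scheme can be sketched as a small state machine. The sketch assumes that sensors report the name of the state just entered and that the model is a set of allowed transitions; the state names and the latching behaviour after a violation are illustrative assumptions, not details from [17, 22].

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: a trusted monitor that checks reported state transitions.
final class StidMonitorSketch {
    private final Map<String, Set<String>> allowed = new HashMap<>();
    private String state;
    private boolean tampered = false;

    StidMonitorSketch(String initialState) {
        this.state = initialState;
    }

    // Add a legal transition to the state machine model.
    void allow(String from, String to) {
        allowed.computeIfAbsent(from, k -> new HashSet<>()).add(to);
    }

    // Called for each sensor event; returns false once an illegal transition has been seen.
    boolean observe(String next) {
        Set<String> legal = allowed.getOrDefault(state, Collections.<String>emptySet());
        if (tampered || !legal.contains(next)) {
            tampered = true; // latch: later events from this agent are no longer trusted
            return false;
        }
        state = next;
        return true;
    }
}
```

With a monotonic model such as INIT, then QUOTED, then PAID, an event that moves backwards (for example, QUOTED to INIT) is flagged as an inconsistency.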


[Figure 2.5: timelines (t) of the client computation (trusted), the McRPC service (trusted), and the agent computation (not trusted).]

To investigate S2PRE, Scott O’Neil designed and implemented modifications and extensions to Sun’s Remote
Method Invocation (RMI) remote procedure call system [14]. On the first Mobile Code RPC (McRPC) to a server,
we migrate a copy of the program to a mobile code server (the program’s shadow) co-located with (or near) the RPC
service and invoke the RPC from there. The RPC response and the RPC input parameters are signed by the RPC server
and sent back to the mobile code server, which copies them to the original program. The original program verifies that
the RPC input parameters are what it would have sent, and then accepts the RPC response by sending a “commit” message to
the RPC server. Any future McRPCs take place in a lock-step manner: the shadow runs slightly ahead of the
original and sends the RPC inputs to the RPC server, and by the time the original has computed the RPC inputs, the forwarded
RPC response will have arrived. Figure 2.5 shows how a McRPC is implemented using migration and normal RPCs.
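
The verification step performed by the original program can be sketched as follows. This is a simplified, single-process illustration and not the RMI-based implementation of [14]: the class names, the string encoding of inputs and responses, the use of RSA signatures, and the commit and rollback counters are all assumptions.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PublicKey;
import java.security.Signature;

// Hypothetical sketch: the trusted original checks what the untrusted shadow sent.
final class McRpcSketch {
    static final class SignedReply {
        final String inputs;
        final String response;
        final byte[] signature;

        SignedReply(String inputs, String response, byte[] signature) {
            this.inputs = inputs;
            this.response = response;
            this.signature = signature;
        }
    }

    // Unambiguous encoding of (inputs, response): length prefix for the inputs.
    private static byte[] encode(String inputs, String response) {
        return (inputs.length() + ":" + inputs + response).getBytes(StandardCharsets.UTF_8);
    }

    private static KeyPair newKeyPair() {
        try {
            KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
            gen.initialize(2048);
            return gen.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    static final class RpcServer {
        private final KeyPair keys = newKeyPair();
        int committed = 0;
        int rolledBack = 0;

        PublicKey publicKey() {
            return keys.getPublic();
        }

        // Tentatively executes the call and signs both the inputs it saw and its response.
        SignedReply call(String inputs) {
            String response = "result-for(" + inputs + ")";
            try {
                Signature s = Signature.getInstance("SHA256withRSA");
                s.initSign(keys.getPrivate());
                s.update(encode(inputs, response));
                return new SignedReply(inputs, response, s.sign());
            } catch (GeneralSecurityException e) {
                throw new IllegalStateException(e);
            }
        }
    }

    // Runs at the trusted original: commit only if the signed inputs match its own computation.
    static boolean acceptReply(RpcServer server, String expectedInputs, SignedReply reply) {
        boolean ok;
        try {
            Signature s = Signature.getInstance("SHA256withRSA");
            s.initVerify(server.publicKey());
            s.update(encode(reply.inputs, reply.response));
            ok = s.verify(reply.signature) && reply.inputs.equals(expectedInputs);
        } catch (GeneralSecurityException e) {
            ok = false;
        }
        if (ok) {
            server.committed++;  // stands in for the "commit" message
        } else {
            server.rolledBack++; // stands in for rolling back the tentative transaction
        }
        return ok;
    }
}
```

If the shadow's host tampers with the inputs, the signed inputs no longer match what the original computes, so the response is never committed.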

Speculative execution using a redacted/simplified version of the program remains to be investigated. Using a
redacted version of the program would trade prediction accuracy for (perhaps) less information exposure;
currently there is no computational confidentiality without applying other security techniques (e.g., [1]). Unfortunately,
none of these other techniques is sufficiently efficient, and using them would destroy the lock-step nature of S2PRE.

2.6  Support  for  Grid  Services 

Grid services were first developed in the context of wide-area scientific computation, called grid computing. In many
ways, grid computing was a return to the ideas of distributed computing that were studied ten or more years ago: a
large number of moderately tightly coupled computation nodes providing a transparent high performance platform.
The difference is that grid computing was doing this for real: the platform was not just meant for evaluation but is being
used for production purposes. Hence, many issues that were glossed over are now vital. One of the first issues that
needed to be addressed was resource management.

At the same time that the computational grid was developing, web services were becoming widely deployed. Web
services are widely used commercially to support web servers, which are now a commercially vital infrastructure. Because
they grew out of web servers, web services did not directly address distributed computation. The idea of Grid services
was thus created: to combine the grid computation and the Web services models. The goal is to create a platform that
supports ambitious multisite applications (now often called “service oriented architecture”, or SOA).

Important  concerns  for  Grid  services  are  security  and  fault  tolerance.  There  are  two  characteristics  of  Grid  services 
that  impact  how  fault  tolerance  is  best  provided: 

1) Grid services run on high level protocols. Being so high on the stack makes it hard to have short and tight
upper bounds on a process’ responsiveness. This makes some simple and popular approaches based on perfect failure
detectors, such as message logging and primary-backup protocols, unsuitable.

2)  Grid  services  were  first  developed  for  resource  scheduling.  Grid  scheduling  often  uses  randomized  algorithms 
to  balance  load  more  efficiently.  Nondeterminism  makes  providing  fault-tolerance  harder. 

Under this research grant, we launched research into fault-tolerance mechanisms for services with the above characteristics. This
work is the core of the PhD dissertation of Xianan Zhang (degree expected in Fall 2006).

She designed these mechanisms under some practical constraints: the performance overhead needed to be low;
reusing existing grid mechanisms is highly desirable; and adding fault tolerance to an existing grid service should be simple,
ideally done without changing any code on the server.

This  work  was  based  on  [10].  Work  done  under  this  grant  is  presented  in  [26]  as  well  as  three  additional  papers, 
two  under  review  and  one  being  prepared. 